Abstract

Background: Although many hyperthermophilic endoglucanases have been reported from archaea and bacteria, a 
complete survey and classification of all sequences in these species from disparate evolutionary groups, and the 
relationship between their molecular structures and functions are lacking. The completion of several high-quality 
gene or genome sequencing projects provided us with the unique opportunity to make a complete assessment 
and thorough comparative analysis of the hyperthermophilic endoglucanases encoded in archaea and bacteria. 

Results: Structure alignment of the 19 hyperthermophilic endoglucanases from archaea and bacteria that grow
above 80°C revealed that Gly30, Pro63, Pro83, Trp115, Glu131, Met133, Trp135, Trp175, Gly227 and Glu229 are
conserved amino acid residues. In addition, the average percentage composition of cysteine and histidine
in the 19 endoglucanases is only 0.28 and 0.74, respectively, whereas it is higher in thermophilic or mesophilic ones.
Phylogenetic analysis indicates a close relationship among the 19 proteins from hyperthermophilic bacteria and
archaea. Among the conserved amino acid residues, as far as Cel12B is concerned,
the two Glu residues might be the catalytic nucleophile and proton donor; Gly30, Pro63, Pro83 and Gly227
might be necessary for the thermostability of the protein; and Trp115, Met133, Trp135 and Trp175 are related to
substrate binding. Site-directed mutagenesis results reveal that Pro63 and Pro83 contribute to the thermostability
of Cel12B, and Met133 is confirmed to have a role in enhancing substrate binding.

Conclusions: The conserved amino acids have been shown to be of great importance for maintaining the structure
and thermostability of these proteins, as well as the similarity of their enzymatic properties. We have clarified
the function of these conserved amino acid residues in the Cel12B protein, which is helpful for analyzing other
molecular structures that have not been characterized in detail and for modifying them by site-directed
mutagenesis, and which provides a theoretical basis for degrading cellulose from woody and herbaceous plants.
Background

Cellulose is the most abundant organic compound and
renewable carbon resource on earth [1]. Biodegradation
of cellulose, an abundant plant polysaccharide, is a complex
process that requires the coordinated action of three
enzymes. Among these, endoglucanases (EC 3.2.1.4) play
the most important role: they break the internal bonds of
cellulose and disrupt its crystalline structure, exposing
the individual cellulose polysaccharide chains [2-4]. The
degradation is mainly carried out by bacteria, fungi and
protozoa, commensals in the guts of herbivorous animals,
as well as by the termite Reticulitermes speratus [5], which
together produce a variety of endoglucanases. The
complex chemical nature and heterogeneity of cellulose
account for the multiplicity of endoglucanases produced
by microorganisms. The activity of different endoglucanases
with subtle differences in substrate specificity and
mode of action improves the degradation of plant
cellulose in natural habitats. There are fourteen
families of glycoside hydrolases (GHF) that are used
for cellulose hydrolysis [6]. Extremophiles, and especially
their hyperthermophilic enzymes, have been studied
increasingly in recent years. Based on amino acid sequence
homologies and hydrophobic cluster analysis,
hyperthermophilic endoglucanases obtained from extremophiles,
which are widely distributed in terrestrial and marine
hydrothermal areas as well as in deep subsurface oil
reservoirs, have been classified into GHF12 [7-14]. As
described above, there are hyperthermophilic endoglucanases
from archaea, most of which were chosen for
sequencing on the basis of their physiology [15]. In
addition, many hyperthermophilic endoglucanase genes
have been cloned from heat-tolerant bacteria [16].
A common feature of these hyperthermophilic endoglucanases
is that their amino acid sequences are relatively short
(mostly fewer than 400 amino acid residues).

Although many hyperthermophilic endoglucanases of
GHF12 have been reported from archaea
and bacteria, a complete survey and classification of all
sequences in these species from disparate evolutionary
groups, and the relationship between their molecular
structures and functions, are lacking. The completion of
several high-quality gene or genome sequencing projects
provided us with the unique opportunity to make an
unprecedented assessment and thorough comparative
analysis of the hyperthermophilic endoglucanases encoded
in archaea and bacteria. The analysis of the full set of
hyperthermophilic endoglucanase genes in genomes
from diverse species allows a definitive classification of
hyperthermophilic endoglucanases and an assessment of
their origins, evolutionary relations, patterns of
differentiation, and proliferation in the various phylogenetic
groups. We are interested in finding answers to the
following questions: 1) What are the evolutionary relations
among these hyperthermophilic endoglucanases? 2) What
is the common feature between these conserved amino
acid residues and the 3D topological structure? 3) What is the
mechanism of heat tolerance among these
hyperthermophilic endoglucanases?

The broad analysis in this study provided a comprehensive
classification scheme and proposed a molecular
structure applicable to all hyperthermophilic endoglucanases.
A clear picture of the patterns of endoglucanase
classes in different species groups was provided. We
identified and classified in this study a larger number of
hyperthermophilic endoglucanase sequences from
GHF12 than previously reported, allowing us to identify
their relationships based on phylogenetic clustering.
We found that sequences from hyperthermophilic bacteria,
like those from archaea, are quite different from the
other sequences in GHF12. We characterized several
conserved amino acid sites in these endoglucanases and
predicted their functionality based on amino acid
similarity among the proteins available in databases. The
resulting rich data set of hyperthermophilic endoglucanases
from GHF12, comprising 19 sequences, can be
downloaded from NCBI (Table 1).

Results 

Protein sequences characteristics 

GenBank has grown rapidly in recent years and offers
much better taxonomic sampling for BLAST-based
analysis [17]. We performed a similar BLAST-based
analysis for the 19 thermophilic endoglucanase protein
sequences (which included the T. maritima endoglucanase
sequences), using the nonredundant (nr) database
as a reference and recording the highest-ranking matches.
We also searched for endoglucanase sequences in several
plants, bacteria, fungi and algae, as well as in
R. speratus, using the protein BLAST
search engine with a variety of endoglucanase amino
acid sequences as queries for most of the thermophilic
endoglucanases, and otherwise using "endoglucanase" as a keyword
to find other endoglucanase amino acid sequences
(Table 1). In most cases, whenever significant similarity
to an endoglucanase sequence was identified, the
amino acid sequence was excised and homology-based
protein predictions were performed using the most similar
query as a guide. All of these 40 protein sequences
range from 252 to 438 amino acid residues in length. Of
these sequences, those from archaea and bacteria showed
similar lengths. In the 19 thermophilic endoglucanase
protein sequences, the average percentage
composition of cysteine and histidine is only
0.28 and 0.74, respectively; these residues are less frequent in
thermophilic proteins according to the amino acid composition
statistics calculated with MEGA 5 (Table 2).
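
The percentage composition reported above can be illustrated with a short sketch. This is only an assumed, minimal way to compute residue percentages from a one-letter protein sequence; it is not the MEGA 5 procedure, and the function name and the toy peptide are hypothetical.

```python
from collections import Counter


def residue_percentages(sequence: str) -> dict[str, float]:
    """Return the percentage of each one-letter residue in a protein sequence."""
    # Ignore alignment gaps and case differences.
    seq = sequence.upper().replace("-", "")
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Hypothetical 20-residue toy peptide, not a real endoglucanase sequence.
toy = "MKCHAAGGPPWWEELLTTVV"
composition = residue_percentages(toy)
cys_pct = composition.get("C", 0.0)
his_pct = composition.get("H", 0.0)
print(f"Cys {cys_pct:.2f}%  His {his_pct:.2f}%")
```

Applied to each of the 19 sequences, the "C" and "H" entries would correspond to the Cys and His columns of Table 2.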

Phylogenetic analysis 

Phylogenetic analysis based on the maximum-parsimony
(MP) and neighbour-joining (NJ) procedures implemented
in PAUP 4.0 [18] and on other approaches (see Methods)
indicated that all endoglucanase proteins can
be reliably grouped into 3 distinct classes, except for the
outgroup R. speratus, which is an insect
(Figure 1). Furthermore, from the multiple sequence
alignments, the hyperthermophilic endoglucanase proteins
belong to class I, and the others belong to classes II and III.
No obvious differentiation is apparent among these 19 protein
sequences. It was not surprising that there was a close
relationship among the 19 protein sequences from bacteria
and archaea, supported by good bootstrap values
in the maximum-likelihood (ML) tree built with
MEGA 5 (Figure 2). It was inferred that the endoglucanases
of Dictyoglomus turgidum, Thermotoga naphthophila
and Thermotoga maritima, which are currently studied in
our research group, are closely related compared to the
others, although the identity of the amino acid sequences
Table 1 The phylogenetic distribution of endoglucanases from glycoside hydrolase family 12

Group          Organism                      Length (aa)  GenBank number*
Euryarchaeota  Acidilobus saccharovorans     396          ADL19785
               Ignisphaera aggregans         360          ADM27702
               Metallosphaera cuprina        326          AEB95090
               Pyrococcus furiosus           319          AAD54602
               Sulfolobus acidocaldarius     311          AAY81158
               Sulfolobus islandicus         332          ADX81754
               Sulfolobus islandicus         334          ACP37717
               Sulfolobus islandicus         334          ACR41545
               Sulfolobus islandicus         334          ADX84872
               Sulfolobus solfataricus       334          AAK42142
               Thermococcus sp.              319          EEB73588
               Thermoproteus tenax           263          CCC81038
               Thermoproteus uzoniensis      252          AEA12777
               Vulcanisaeta distributa       330          ADN509821
Bacteria       Acidobacterium sp.            439          ZP_07030982
               Bacillus licheniformis        261          AAP44491
               Dictyoglomus turgidum         288          YP_002352530
               Paenibacillus mucilaginosus   266          AEI43442
               Spirochaeta thermophila       438          ADN02999
               Spirochaeta thermophila       433          AEJ62362
               Teredinibacter turnerae       278          ACR14297
               Thermobispora bispora         393          ADG87082
               Thermotoga naphthophila       274          YP_003346783
               Thermotoga maritima           275          Z69341
               Lysobacter enzymogenes        383          ABI54135
               Bacillus megaterium           345          ADE69644
               Streptococcus dysgalactiae    366          BAH80742
               Streptococcus dysgalactiae    366          YP_002995956
               Bacillus thuringiensis        349          ZP_04083086
Fungi          Stachybotrys echinata         237          AF435067
               Aspergillus fumigatus         378          EDP50688
               Aspergillus fumigatus         378          XP_751495
               Neosartorya fischeri          381          XP_001266710
               Aspergillus niger             396          XP_001400178
               Penicillium marneffei         379          XP_002147625
               Talaromyces stipitatus        503          XP_002481822
               Ajellomyces dermatitidis      357          XP_002621187
Planta         Arabidopsis thaliana          484          BAB11001
               Thalassiosira pseudonana      499          XP_002287341
Insect         Reticulitermes speratus       448          AB019095

*All the sequences were downloaded from GenBank (http://www.ncbi.nlm.nih.gov/protein/).



was less than 30% (Figure 1, Figure 2). Therefore,
it was postulated that they may have a common origin
in protein evolution. Class II comprises 12 other
proteins from plants, fungi and bacteria, and class III
comprises 8 proteins from bacteria.
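
The sequence identity figure quoted above depends on how identity is defined. As a minimal sketch, assuming two sequences that are already aligned to the same length and counting only columns where neither has a gap (one common convention, not necessarily the one used by the programs in this study), identity can be computed as follows. The function name and the two toy fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments, for illustration only.
frag_a = "GYPEIWLG-K"
frag_b = "GYPEVYFGTK"
identity = percent_identity(frag_a, frag_b)
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the shorter sequence length, so reported values are only comparable when the denominator is stated.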

Analysis of conserved and catalytic amino acid residues 

To further analyze the relationship among the 19
hyperthermophilic endoglucanases from bacteria and
archaea, the 19 amino acid sequences were aligned again
with Clustal X2 (Figure 3). We found that the conserved
amino acids of the hyperthermophilic endoglucanases, numbered
as in Cel12B, include Gly30, Pro63, Pro83, Trp115, Glu131,
Met133, Trp135, Trp175, Gly227 and Glu229, which are
highlighted in red (Figure 3); this is very different from
previously reported data [19,20]. Among these
conserved amino acids, the two glutamic acid residues might be
the catalytic nucleophile and proton donor, as in lysozyme
with acid-base catalysis [21], and the other eight conserved amino
acids might be necessary for the thermostability of the protein
and the binding of the substrate.
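
Finding fully conserved positions in a multiple alignment amounts to scanning each column for a single shared residue. The sketch below is an assumed, simplified illustration of that idea and is not part of Clustal X2; the function name and the three-row toy alignment are hypothetical.

```python
def conserved_columns(alignment: list[str]) -> list[tuple[int, str]]:
    """Return (1-based column, residue) for columns identical in every row."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for i in range(width):
        column = {row[i] for row in alignment}
        # A column is conserved when it holds exactly one residue and no gap.
        if len(column) == 1 and "-" not in column:
            hits.append((i + 1, alignment[0][i]))
    return hits


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "GYPEIW-GK",
    "GYPEVYFGT",
    "GFPDIMYGW",
]
conserved = conserved_columns(toy_alignment)
print(conserved)
```

The indices returned are alignment columns. Converting them to residue numbers such as Gly30 or Pro63 requires counting the non-gap characters of the reference sequence up to each column.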

Hyperthermophilic protein homology modeling 

All the hyperthermophilic protein sequences were
modeled using SWISS-MODEL,
but only one good model, the Cel12B protein model
from T. maritima, could be used to locate the conserved
amino acids in the secondary structure and the
enzymatic center of the protein. In the Cel12B
protein model, residues Glu131, Glu229, Trp115, Trp135, Trp175
and Met133 comprise the active center of the
protein (Figure 4a). The Cel12B protein is primarily
composed of β-sheet (Figure 4a,b,c,d). Residues Trp115, Glu131,
Met133, Trp135 and Gly227 are in the β-sheet;
Pro63 and Trp175 are in the turn; and Gly30,
Pro83 and Glu229 are in the random coil
(Figure 4b,d).

Analysis of site-directed mutagenesis 

Based on the homology modeling, the functional amino
acid residues Glu64, Pro63, Pro83 and Met133 of Cel12B
were selected for mutation. The results showed that
the P63K, P83K, M133W, E64H, E64T and E64I mutations
dramatically reduced the enzyme activity of
Cel12B toward CMC-Na, while the E64S mutant protein showed
apparently increased enzyme activity (Table 3).
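
Mutation names such as P63K follow the usual pattern of wild-type residue, 1-based position and replacement residue. The sketch below shows one assumed way to parse that notation and apply it to a sequence in silico; the function name and the five-residue toy sequence are hypothetical and do not represent Cel12B.

```python
import re

# Wild-type residue, 1-based position, replacement residue (e.g. "P63K").
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(sequence: str, mutation: str) -> str:
    """Return the sequence with a single point substitution applied."""
    match = MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation: {mutation!r}")
    wild_type, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3))
    if not 1 <= position <= len(sequence):
        raise ValueError("position outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError("wild-type residue does not match the sequence")
    return sequence[:position - 1] + replacement + sequence[position:]


# Hypothetical toy sequence with a proline at position 3.
toy = "MAPEK"
mutant = apply_mutation(toy, "P3K")
print(mutant)
```

Checking the wild-type residue before substituting guards against numbering offsets, for example when a signal peptide shifts residue positions.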

Discussion 

Endoglucanases isolated from hyperthermophilic organisms
are more active and stable at higher temperatures
than their counterparts from mesophiles. In addition,
they may be more appropriate for the degradation of
cellulose. Since the enzyme activity of these
hyperthermophilic endoglucanases is not high enough for degradation, the
Table 2 The amino acid frequencies (%) of the nineteen endoglucanases

Sequence         Ala    Cys    Asp    Glu    Phe    Gly    His    Ile    Lys    Leu    Met    Asn    Pro    Gln    Arg    Ser    Thr    Val    Trp    Tyr   Total
ADX81754        4.74   0.00   2.55   3.65   5.84   6.20   0.36   6.93   2.19   6.93   2.92   9.49   7.30   3.28   1.82   7.66  10.58   6.93   4.74   5.84  274.00
ACP37717        4.74   0.00   2.55   3.65   5.84   6.20   0.73   6.93   2.19   6.93   2.92   9.49   7.30   2.92   1.82   7.66  10.58   6.93   4.74   5.84  274.00
ADX84872        5.08   0.00   2.97   3.81   6.36   7.20   0.42   7.20   2.54   6.36   3.39   9.75   6.78   3.39   2.12   5.08   9.32   6.78   5.51   5.93  236.00
ACR41545        5.08   0.00   2.97   3.81   6.36   7.20   0.85   7.20   2.97   6.36   3.39   9.75   6.78   2.97   2.12   5.08   8.90   6.78   5.51   5.93  236.00
AAK42142        5.51   0.00   2.97   3.39   6.36   7.20   0.42   7.20   2.97   5.51   3.39  10.17   6.78   3.39   2.12   5.08   8.90   7.20   5.51   5.93  236.00
ADM27702        6.52   0.72   6.16   3.26   3.62   8.70   0.72   9.42   2.90   5.43   1.45   6.16   7.25   3.62   4.35   5.80   4.71   8.70   3.62   6.88  276.00
ADN02999        5.15   0.00   8.46   5.51   5.51   7.72   0.74   4.78   0.74   6.62   1.47   5.51   5.88   4.78   5.51   6.62   8.82   7.72   4.41   4.04  272.00
AEJ62362        5.15   0.00   8.09   6.25   5.51   7.72   0.74   4.04   0.74   6.62   1.10   5.51   5.88   4.41   5.51   6.62   8.82   8.82   4.41   4.04  272.00
AF181032        5.54   0.00   4.43   7.01   3.32   7.01   1.11   9.59   4.43   7.75   0.74   7.38   7.01   1.85   2.21   5.17   9.59   6.27   4.06   5.54  271.00
EEB73588        6.42   0.00   6.04   8.68   5.28   8.68   1.89   3.40   2.64   7.55   4.15   6.04   6.42   1.13   4.15   4.91   5.66   9.43   3.77   3.77  265.00
YP_003346783    3.97   0.40   6.35   7.94   7.14   6.75   1.19   4.37   6.75   5.56   1.98   5.95   4.76   1.98   1.59   4.76   7.94  10.32   4.76   5.56  252.00
Z69341          4.37   0.40   6.35   7.94   7.14   6.75   1.19   4.37   6.75   5.16   2.38   5.95   4.76   1.98   1.59   4.76   7.54  10.32   4.76   5.56  252.00
YP_002352530    5.43   0.00   4.26   8.53   4.26   5.04   1.16   9.69   9.30   5.81   1.94   7.75   4.65   1.55   2.33   5.43   5.04   6.59   4.26   6.98  258.00
AEA12777       10.71   0.40   4.37   6.35   4.76   7.54   0.00   5.16   3.57   5.95   3.97   3.17   7.14   2.78   3.57   8.33   4.76   6.75   4.37   6.35  252.00
AAY81158        2.90   0.36   5.07   2.54   5.80   7.97   0.72   8.33   2.90   7.97   2.90   9.78   3.99   3.26   1.45   7.61   7.97   8.70   2.17   7.61  276.00
AEB95090        2.89   0.00   4.33   4.33   5.42   7.94   0.36   6.14   3.25   8.66   4.33   7.22   5.42   2.89   2.17  10.11   5.78   7.58   2.53   8.66  277.00
ADN509821       3.90   1.42   2.48   3.90   3.19   7.09   0.71   9.22   3.19   9.22   3.19  11.35   6.74   1.42   1.77   7.80   4.61   6.03   4.61   8.16  282.00
ADL19785        3.96   0.00   3.24   3.60   3.24  12.59   0.36   6.12   0.72  11.51   4.32   7.19   5.76   2.16   2.88   7.91   5.76   8.63   3.96   6.12  278.00
CCC81038        9.13   1.66   4.98   4.56   3.32   9.13   0.41   2.49   1.24   9.96   1.24   3.32   7.47   2.49   6.22   7.88   4.15   9.96   3.32   7.05  241.00
Avg.            5.28   0.28   4.68   5.18   5.14   7.63   0.74   6.49   3.23   7.19   2.69   7.43   6.20   2.75   2.91   6.59   7.33   7.91   4.24   6.10  262.11




modification of these hyperthermophilic enzymes by genetic
engineering is essential. Few structures have been reported
in databases so far to guide the transformation of those enzymes. In
this paper, nineteen sequences of hyperthermophilic
endoglucanases were aligned and used for phylogenetic
tree construction and molecular modeling to illustrate
the relationship between structure and thermostability.

The features of the natural environment of an ancestral
organism can be inferred by reconstructing a phylogenetic
tree from the amino acid sequences of these organisms [22].
Based on the alignment of the amino acid sequences, the
hyperthermophilic proteins from bacteria and archaea
cluster together in the phylogenetic tree
(Figure 1). Archaea, known to be among the most ancient organisms
on earth, grow in strictly anaerobic environments
(terrestrial solfataric springs, hydrothermal areas, and
deep subsurface oil reservoirs) at high temperature
(generally above 80°C), and hyperthermophilic bacteria
live under the same conditions [13,23]. Therefore, it is
inferred that GHF12 endoglucanases from hyperthermophilic
microorganisms could share similar
enzymatic properties and catalytic mechanisms.

The stability of thermophilic proteins depends on several
amino acid residues and structural factors [24]. Specific
amino acid composition plays a critical role in the
thermostability of hyperthermophilic endoglucanases:
statistical comparisons of amino acid composition indicate
that cysteine and histidine are among the least frequent
residues in thermostable proteins [25,26].
Consistent with this feature, the average content of
cysteine and histidine in our research is only 0.28 and 0.74,
respectively (Table 2).
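
The averages in the last row of Table 2 can be checked directly from the Cys and His columns. The sketch below assumes a simple unweighted mean over the 19 sequences, with the values copied from Table 2 in row order; the variable names are illustrative.

```python
from statistics import mean

# Cys and His percentages of the 19 sequences, in the row order of Table 2.
cys = [0.00, 0.00, 0.00, 0.00, 0.00, 0.72, 0.00, 0.00, 0.00, 0.00,
       0.40, 0.40, 0.00, 0.40, 0.36, 0.00, 1.42, 0.00, 1.66]
his = [0.36, 0.73, 0.42, 0.85, 0.42, 0.72, 0.74, 0.74, 1.11, 1.89,
       1.19, 1.19, 1.16, 0.00, 0.72, 0.36, 0.71, 0.36, 0.41]

# Unweighted mean over sequences, rounded to two decimals as in the table.
cys_avg = round(mean(cys), 2)
his_avg = round(mean(his), 2)
print(cys_avg, his_avg)
```

Under this assumption the computed means agree with the "Avg." row of Table 2 (0.28 for Cys and 0.74 for His).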

Ten conserved amino acids were found by aligning
the nineteen hyperthermophilic protein sequences
(Figure 3); we hypothesize that they may play a significant
role in proton donation, substrate binding and
high thermostability. Among these nineteen amino acid
sequences, only the three-dimensional structure of the
endoglucanase from T. maritima could be obtained (Figure 4),
since there is no suitable template for homology modeling
of the other proteins. Thus, the relationship between
the ten amino acid residues of these endoglucanases and
their molecular structures is illustrated with the Cel12B
protein from T. maritima. Substituting a non-Gly
residue with a Gly residue can be used as a general
strategy to enhance protein stability [27,28]. In
our study, residues Gly30 and Gly227, located in random
coil and β-sheet, respectively, might contribute to the
thermostability of the protein (Figure 4b,d).

Loops and turns are believed to be the weak connections
among protein secondary structure elements,
but it was recently demonstrated that they play a key
role in protein thermostability, especially in
proteins where proline is located in a loop or turn region [29].
Figure 1 The phylogenetic tree obtained using the endoglucanases and outgrouped by the protein sequence of R. speratus. The NJ
(a) and MP (b) trees were generated using the program PAUP 4.0 beta 10 Win on 40 aligned amino acid sequences. All the protein sequences are from Table 1.
Proteins from hyperthermophilic bacteria and archaea are shown within light blue colored boxes (I). Other proteins from bacteria, fungi and
plants are shown within yellow (II) and blue (III) colored boxes.



Proline in the polypeptide chain possesses less
conformational freedom than other amino acids, as the
pyrrolidine ring of proline imposes rigid constraints on the N-Cα
rotation and restricts the available conformational space
of the preceding residue. Therefore, it can bend the
polypeptide chain back on itself, so that the backbone
more easily forms hydrogen bonds with the
polar side chains of other turns; meanwhile, the
hydrophobic part of proline can interact with the adjacent
hydrophobic cavity [30,31]. Compared to mesophilic
proteins, thermophilic proteins contain more proline
residues, which occur with higher frequency at turns,
as well as a shorter loop region, as in the glucosidase.
As a consequence of the reduced flexibility of
the polypeptide chain, protein thermostability can be
increased by introducing prolines at specific sites
[29,31,32]. Hence, residues Pro63 and Pro83, located
in the turn and random coil respectively (Figure 4c,d),
could provide closer packing of each region, which was
assumed to favor the thermostability of the
protein. This was confirmed by the experimental
results. Compared to other amino acids, lysine
has a longer side chain with more vibrational degrees
of freedom, and it is more sensitive to
temperature. When proline is substituted with lysine,
the vibration of the side chain increases at high
temperature, and the thermostability of Cel12B
decreases dramatically. Therefore, it is confirmed that
residues Pro63 and Pro83 play an important role in
stabilizing Cel12B.

The crystal structure and protein molecular simulation
support the view that two glutamic acid residues act as the
catalytic nucleophile and proton donor, as has been
reported for many enzymes, including lysozyme, xylanase and
endoglucanase [33]. Thus, residues Glu131 (in β-sheet) and Glu229
(in random coil) are the proton donor and



Shi et al. BMC Structural Biology 2014, 14:8
http://www.biomedcentral.com/1472-6807/14/8



Figure 2 The ML tree obtained from the amino acid sequences of the 19 endoglucanases using the program MEGA 5. Numbers on nodes correspond to
percentage bootstrap values for 1000 replicates.



catalytic nucleophile, respectively (Figure 4b,d). Although
the chemical nature of the tryptophan residue in the
catalytic center does not significantly affect the
conformational properties of lysozyme, it exhibited a pronounced
effect on substrate binding and on the enhancement
of total enzyme activity [34]. It was reported that
structural changes at the active site (W95L) of alcohol
dehydrogenase from Sulfolobus solfataricus are consistent
with reduced activity on substrates and
decreased coenzyme binding [35]. Therefore, we propose
that the three tryptophan residues (Trp115, Trp135 and Trp175;
Figure 4b,c) of the Cel12B protein may be essential in
mediating the overall cooperativity of the enzyme's response
to substrate. Met133, located between
Trp135 and Glu131 in the β-sheet (Figure 4b), was predicted
to be related to substrate binding, and this was
confirmed by the experimental results: when it is replaced
by a tryptophan residue, the enzyme activity decreases
significantly. From the homology modeling result
(data not shown), it is inferred that Glu64 is probably
another functional amino acid located near the catalytic
center. Residue Glu64 might contribute
to stabilizing the intermediate product, possibly through
the interaction of its side chain. The polar amino acids
histidine and threonine are able to stabilize the
intermediate product to some extent. However, their
side chains are relatively large and cause greater
steric hindrance, and thus lead to a decrease in enzyme
activity. Compared to glutamic acid, histidine and
threonine, serine has a smaller side chain and less steric
hindrance, so it can easily form a hydrogen bond with the
product and stabilize it, thereby increasing the enzyme
activity.

Conclusions 

Nineteen homologous hyperthermophilic protein sequences
from GHF12 were aligned and used to construct a
phylogenetic tree. It was inferred from the nodes that there is a
close relationship among these nineteen homologous
endoglucanases from hyperthermophilic bacteria and archaea.
We have clarified the function of the conserved amino
acids in the Cel12B protein, which is helpful for analyzing other
molecular structures and modifying them by site-directed
mutagenesis.

Methods 

Extraction of sequences from databases 

Thorough BLASTP searches for several divergent
endoglucanases of plants, animals, bacteria, fungi, algae and
archaea were performed to retrieve endoglucanase genes
through the NCBI, PDB (http://www.rcsb.org/pdb/home/home.do)
and UniProt (http://www.uniprot.org/) database servers.
A hyperthermophilic endoglucanase amino acid sequence
(GenBank No: Z69341) [16] was used as a BLAST query for
seeking hyperthermophilic endoglucanases from bacteria and
[Figure 3: multiple sequence alignment of the hyperthermophilic endoglucanases; the position labels 30, 63, 83, 115, 131, 133 and 135 are legible in the portion shown.]

--HGAGWTYIAFVP\?v' GGYRXGXIGmAPFLX'YM\TLLPQKCPSIWKSPDXTSX'LWLMDIELGSEFG 

— QGSWSVLVFVLR RPVRSGTVAVDLGQLVR.AAQETLX'GTP IDLERLXLTSIDVGMEFD 

-GGPSGW^ILIILIPEOTSGGQYHGLLKGEYGVX'LGELMXETLXIIGEFXGTR WEQGLYLSVIQLGAEVD 

175 227 229 



Figure 3 Alignment of 19 endoglucanases amino acids sequences using CLUSTAL X2.0. The highly conserved amino acids are colored 
in red. 



archaea. New rounds of BLASTP searches of the nr protein
and GenBank databases at NCBI, restricted to plants or other
organisms, were carried out using representative endoglucanases
from different classes of plants, bacteria, fungi and algae
as queries.



Multiple sequence alignment and phylogenetic analysis 

Multiple sequence alignment is one of the most widely used
bioinformatics analyses, and it relies on several widely
used software packages. In this study,
the multiple sequence alignment tool Clustal X2 was
Figure 4 Structure modeling of the protein Cel12B. Different segments of the protein secondary structure are colored accordingly. The
catalytic amino acids (Glu131 and Glu229), located in the center of the structure, are labeled in red (a, b, d). The amino acids Trp115, Trp135
and Trp175 are labeled in magenta (a, b, c) and Met133 is labeled in blue (a, b); these four amino acids are important for
substrate binding. The amino acids Pro63 and Pro83 are labeled in black (a, c, d), and Gly30 and Gly227 are labeled in cyan (a, b, d); these
four amino acids are closely related to the thermostability of the enzyme.



used for sequence alignment [36]. Sequences were further
edited using MEGA 5 and aligned manually when
necessary [37]. For the phylogenetic analysis, sequences
were trimmed so that only the relevant conserved domains
remained in the alignment. Phylogenetic relationships
were inferred using the NJ and MP methods as

Table 3 Effect of site-directed mutagenesis on enzyme activity

Strain     Optimum temp (°C)   Specific activity (U mg⁻¹)   Relative activity (%)
Control    90                  105 ± 3.4                    100 ± 3.2
E64T       85                  53 ± 1.3                     50 ± 1.2
E64H       85                  25 ± 1.0                     24 ± 1.0
E64L       ND                  0                            0
E64S       90                  133 ± 2.5                    127 ± 2.4
P63K       ND                  0                            0
P83K       ND                  0                            0
M133W      ND                  0                            0

ND: not determined. Values shown are the means of triplicate experiments.



implemented in PAUP 4.0 [18], and using the Maximum-Likelihood
(ML) method as implemented in MEGA 5 [37]. The
NJ, MP and ML trees, displayed using TREEVIEW 1.6.6
(http://taxonomy.zoology.gla.ac.uk/rod/treeview.html), were
evaluated with 1000 bootstrap replicates.
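
Bootstrap support of this kind rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-alignment. The sketch below illustrates only the resampling step. It is not the PAUP or MEGA implementation, and the function name, sequence names and toy sequences are hypothetical, chosen for illustration rather than taken from the study.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    # Apply the same sampled columns to every sequence so columns stay intact.
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy alignment: hypothetical names and sequences, not data from this study.
toy_alignment = {
    "seqA": "GYPEIYVGRK",
    "seqB": "GYPEIWLGYK",
    "seqC": "GYPGLMYGRE",
}

rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
```

In a full analysis each replicate would be passed to the tree-building method, and the support for a node would be the fraction of replicate trees that contain it.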

Secondary structure prediction 

For homology modeling, the crystal structure of the 
thermophilic endoglucanase (PDB ID: 3AAM) obtained 

Table 4 Nucleotide sequences of the primers used

Primers     Nucleotide sequence
Forward 1   5'-AGTAGATNNNTGGATATCCATGCACCCAGC-3'
Reverse 1   5'-ACGGTOCAAGCCCTGGGCG-3'
Forward 2   5'-CATGGATATAAGGAGATCTACTACGGTOCAAG-3'
Reverse 2   5'-CACCCAGCTGTCTGGATOTGAAG-3'
Forward 3   5'-GAAmcmAGCTGAAGGTGAAAGATCTCC-3'
Reverse 3   5'-AACACCGCTGTOTGCCCCG-3'
Forward 4   5'-CGGAGATCTGGGmGGTOTACAACAACGTO-3'
Reverse 4   5'-CGTCACCCGAAGAAACAGAGGTC-3'
from the Protein Data Bank (PDB) was used as a template.
The aligned sequences were submitted to SWISS-MODEL
(http://www.expasy.org/swissmod/) to obtain
the 3D structure of the endoglucanases [38-40]. The
model was viewed using Swiss-PDB Viewer [41], and
the quality of the model was evaluated with the local
model quality estimation on SWISS-MODEL. The 3D
structure of the protein was further modified using
PyMOL (version 1.4.1, http://www.pymol.org/).

Test of functional residues 

Site-directed mutagenesis by reverse PCR was used to analyze
the putative functional amino acid residues. Restriction
enzymes, DNA polymerase, DpnI, T4 polynucleotide
kinase and T4 ligase were purchased from Takara (Dalian,
China) and used according to the manufacturer's instructions.
The cel12B gene (GenBank Protein No. Z69341) was amplified
from T. maritima genomic DNA using the primers
5'-GGAATTCCATATGAGGTGGGCAGTTCTTCTGA-3' and
5'-CCGCTCGAGTTATTACTCGAGTTTTACACCTTCGACAGAGAAGTC-3'
(with added NdeI and XhoI restriction sites, respectively).
PCR was performed as follows: 94°C for 5 min; 30 cycles of
94°C for 30 s, 55°C for 30 s and 72°C for 50 s; and 72°C for
10 min. To construct the recombinant vector, the amplified
PCR products were purified, digested with NdeI and XhoI,
and then ligated into the pET-20b vector at the corresponding
sites. Reverse PCR amplifications were carried out with
high-fidelity Pyrobest DNA polymerase using recombinant
pET-20b-cel12B as the template and the primers listed in
Table 4. The template was removed from the products by
digestion with DpnI. The resulting products were then
purified with the BIOMIGA PCR Purification Kit (Shanghai,
China), phosphorylated with T4 polynucleotide kinase and
finally ligated with T4 ligase. DNA sequencing was performed
on an ABI 3730 (Applied Biosystems, USA).
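
A quick sanity check on cloning primers is to confirm that each carries its intended restriction site. The sketch below does this for the two amplification primers quoted above, using the standard recognition sequences for NdeI (CATATG) and XhoI (CTCGAG). The helper `find_sites` is a hypothetical name, and the primer strings are transcribed from the text here, so they inherit any transcription error.

```python
def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of site."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are also found.
        start = sequence.find(site, start + 1)
    return positions


NDEI = "CATATG"  # NdeI recognition sequence
XHOI = "CTCGAG"  # XhoI recognition sequence

# Amplification primers as given in the text (5' to 3').
forward = "GGAATTCCATATGAGGTGGGCAGTTCTTCTGA"
reverse = "CCGCTCGAGTTATTACTCGAGTTTTACACCTTCGACAGAGAAGTC"

ndei_hits = find_sites(forward, NDEI)
xhoi_hits = find_sites(reverse, XHOI)
```

As transcribed, the forward primer contains one NdeI site, while the reverse primer contains the XhoI sequence twice, which is worth checking against the original primer design.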

E. coli BL21 (DE3) cells harboring the recombinant plasmids were
grown at 37°C and 200 rpm in 200 mL of Luria-Bertani
(LB) medium with appropriate antibiotic selection. When the
OD600 reached 0.6-0.8, expression of the mutated enzymes
was induced by the addition of 0.5 mM isopropyl
β-D-1-thiogalactopyranoside (IPTG), and the culture was
incubated at 37°C and 200 rpm for 5 h. Cells were harvested
by centrifugation at 4°C (10000 rpm, 5 min),
washed twice with 20 mM Tris-HCl buffer (pH 8.0), and
re-suspended in 5 mL of 5 mM imidazole, 0.5 M NaCl,
and 20 mM Tris-HCl buffer (pH 7.9). All subsequent
steps were carried out at 4°C. The cell extracts obtained
after sonication were heat treated at 50°C for 30 min, cooled in
an ice bath, and then centrifuged (15000 g, 4°C, 20 min).
The resulting supernatants were loaded onto a 1 mL Ni2+
affinity column (Novagen, USA), and the bound proteins
were eluted with a discontinuous imidazole gradient.



Enzyme activity was determined using the 3,5-dinitrosalicylic
acid (DNS) method [42]. The reaction mixture, containing
50 mM imidazole-potassium buffer (pH 6.0), 0.5% sodium
carboxymethyl cellulose (CMC-Na), and endoglucanase
(0.1 μg) in 0.2 mL, was incubated for
10 min at 85°C. The reaction was stopped by the addition
of 0.3 mL DNS. The absorbance of the mixture was measured
at 520 nm. One unit of enzyme activity was defined
as the amount of enzyme necessary to liberate 1 μmol of
reducing sugars per min under the assay conditions. All
enzymatic activity values shown in the figures are
averages of three replicates.
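
The unit definition above translates directly into a specific activity (U per mg of enzyme) and, relative to the control, into the percentages reported in Table 3. The sketch below shows the arithmetic. The function names and the 0.105 μmol reading are hypothetical, picked only so the example lands on the control value, while the mutant activities are the mean values from Table 3.

```python
def enzyme_units(micromol_reducing_sugar, minutes):
    """Units (U): micromoles of reducing sugar liberated per minute."""
    return micromol_reducing_sugar / minutes


def specific_activity(units, enzyme_mg):
    """Specific activity in U per mg of enzyme."""
    return units / enzyme_mg


def relative_activity(specific, control_specific):
    """Activity expressed as a percentage of the control."""
    return 100.0 * specific / control_specific


# Hypothetical assay reading: 0.105 umol of reducing sugar released in the
# 10 min incubation by 0.1 ug (1e-4 mg) of enzyme.
control = specific_activity(enzyme_units(0.105, 10.0), 1e-4)

# Mean specific activities (U/mg) taken from Table 3.
table3 = {"E64T": 53.0, "E64H": 25.0, "E64S": 133.0}
relative = {name: round(relative_activity(value, 105.0))
            for name, value in table3.items()}
```

Rounded to whole percentages, the computed values agree with the relative activity column of Table 3.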